SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: NOVO NORDISK A/S
(ii) TITLE OF INVENTION: A Cellulase Preparation
(iii) NUMBER OF SEQUENCES: 4

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: NOVO NORDISK A/S, Patent Department
(B) STREET: Novo Alle
(C) CITY: Bagsvaerd
(E) COUNTRY: DENMARK
(F) ZIP: DK-2880

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Thalsoe-Madsen, Birgit
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: +45 4444 8888
(B) TELEFAX: +45 4449 3256
(C) TELEX: 37304



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1060 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Humicola insolens
(B) STRAIN: DSM 1800
(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 73..927
(ix) FEATURE:
(A) NAME/KEY: sig_peptide
(B) LOCATION: 10..72
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 10..927

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GGATCCAAG ATG CGT TCC TCC CCC CTC CTC CCG TCC GCC GTT GTG GCC 48
          Met Arg Ser Ser Pro Leu Leu Pro Ser Ala Val Val Ala
          -21 -20                 -15                 -10

GCC CTC CCG GTG TTG GCC CTT GCC GCT GAT GGC AGG TCC ACC CGC TAC 96
Ala Leu Pro Val Leu Ala Leu Ala Ala Asp Gly Arg Ser Thr Arg Tyr
            -5                  1               5

TGG GAC TGC TGC AAG CCT TCG TGC GGC TGG GCC AAG AAG GCT CCC GTG 144
Trp Asp Cys Cys Lys Pro Ser Cys Gly Trp Ala Lys Lys Ala Pro Val
    10                  15                  20

AAC CAG CCT GTC TTT TCC TGC AAC GCC AAC TTC CAG CGT ATC ACG GAC 192
Asn Gln Pro Val Phe Ser Cys Asn Ala Asn Phe Gln Arg Ile Thr Asp
25                  30                  35                  40

TTC GAC GCC AAG TCC GGC TGC GAG CCG GGC GGT GTC GCC TAC TCG TGC 240
Phe Asp Ala Lys Ser Gly Cys Glu Pro Gly Gly Val Ala Tyr Ser Cys
                45                  50                  55

GCC GAC CAG ACC CCA TGG GCT GTG AAC GAC GAC TTC GCG CTC GGT TTT 288
Ala Asp Gln Thr Pro Trp Ala Val Asn Asp Asp Phe Ala Leu Gly Phe
            60                  65                  70

GCT GCC ACC TCT ATT GCC GGC AGC AAT GAG GCG GGC TGG TGC TGC GCC 336
Ala Ala Thr Ser Ile Ala Gly Ser Asn Glu Ala Gly Trp Cys Cys Ala
        75                  80                  85

TGC TAC GAG CTC ACC TTC ACA TCC GGT CCT GTT GCT GGC AAG AAG ATG 384
Cys Tyr Glu Leu Thr Phe Thr Ser Gly Pro Val Ala Gly Lys Lys Met
    90                  95                  100

GTC GTC CAG TCC ACC AGC ACT GGC GGT GAT CTT GGC AGC AAC CAC TTC 432
Val Val Gln Ser Thr Ser Thr Gly Gly Asp Leu Gly Ser Asn His Phe
105                 110                 115                 120

GAT CTC AAC ATC CCC GGC GGC GGC GTC GGC ATC TTC GAC GGA TGC ACT 480
Asp Leu Asn Ile Pro Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr
                125                 130                 135

CCC CAG TTC GGC GGT CTG CCC GGC CAG CGC TAC GGC GGC ATC TCG TCC 528
Pro Gln Phe Gly Gly Leu Pro Gly Gln Arg Tyr Gly Gly Ile Ser Ser
            140                 145                 150

CGC AAC GAG TGC GAT CGG TTC CCC GAC GCC CTC AAG CCC GGC TGC TAC 576
Arg Asn Glu Cys Asp Arg Phe Pro Asp Ala Leu Lys Pro Gly Cys Tyr
        155                 160                 165

TGG CGC TTC GAC TGG TTC AAG AAC GCC GAC AAT CCG AGC TTC AGC TTC 624
Trp Arg Phe Asp Trp Phe Lys Asn Ala Asp Asn Pro Ser Phe Ser Phe
    170                 175                 180

CGT CAG GTC CAG TGC CCA GCC GAG CTC GTC GCT CGC ACC GGA TGC CGC 672
Arg Gln Val Gln Cys Pro Ala Glu Leu Val Ala Arg Thr Gly Cys Arg
185                 190                 195                 200

CGC AAC GAC GAC GGC AAC TTC CCT GCC GTC CAG ATC CCC TCC AGC AGC 720
Arg Asn Asp Asp Gly Asn Phe Pro Ala Val Gln Ile Pro Ser Ser Ser
                205                 210                 215

ACC AGC TCT CCG GTC AAC CAG CCT ACC AGC ACC AGC ACC ACG TCC ACC 768
Thr Ser Ser Pro Val Asn Gln Pro Thr Ser Thr Ser Thr Thr Ser Thr
            220                 225                 230

TCC ACC ACC TCG AGC CCG CCA GTC CAG CCT ACG ACT CCC AGC GGC TGC 816
Ser Thr Thr Ser Ser Pro Pro Val Gln Pro Thr Thr Pro Ser Gly Cys
        235                 240                 245

ACT GCT GAG AGG TGG GCT CAG TGC GGC GGC AAT GGC TGG AGC GGC TGC 864
Thr Ala Glu Arg Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys
    250                 255                 260

ACC ACC TGC GTC GCT GGC AGC ACT TGC ACG AAG ATT AAT GAC TGG TAC 912
Thr Thr Cys Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr
265                 270                 275                 280

CAT CAG TGC CTC TAGAOGCAGG GCAGCITGAG GGOCTTACTG GTGGCOGCAA 964
His Gln Cys Leu
                285


CGAAATGACA CTOOCAATCA CTGTATTAGT TCTTGTACAT AA3TTOCTCA TOOCTOCAGG 1024 
GATIGTCACA TAAATCCAAT GAGGAACAAT GACTAC 1060 
(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 305 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Arg Ser Ser Pro Leu Leu Pro Ser Ala Val Val Ala Ala Leu Pro
-21 -20                 -15                 -10

Val Leu Ala Leu Ala Ala Asp Gly Arg Ser Thr Arg Tyr Trp Asp Cys
-5                  1               5                   10

Cys Lys Pro Ser Cys Gly Trp Ala Lys Lys Ala Pro Val Asn Gln Pro
            15                  20                  25

Val Phe Ser Cys Asn Ala Asn Phe Gln Arg Ile Thr Asp Phe Asp Ala
        30                  35                  40

Lys Ser Gly Cys Glu Pro Gly Gly Val Ala Tyr Ser Cys Ala Asp Gln
    45                  50                  55

Thr Pro Trp Ala Val Asn Asp Asp Phe Ala Leu Gly Phe Ala Ala Thr
60                  65                  70                  75

Ser Ile Ala Gly Ser Asn Glu Ala Gly Trp Cys Cys Ala Cys Tyr Glu
                80                  85                  90

Leu Thr Phe Thr Ser Gly Pro Val Ala Gly Lys Lys Met Val Val Gln
            95                  100                 105

Ser Thr Ser Thr Gly Gly Asp Leu Gly Ser Asn His Phe Asp Leu Asn
        110                 115                 120

Ile Pro Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr Pro Gln Phe
    125                 130                 135

Gly Gly Leu Pro Gly Gln Arg Tyr Gly Gly Ile Ser Ser Arg Asn Glu
140                 145                 150                 155

Cys Asp Arg Phe Pro Asp Ala Leu Lys Pro Gly Cys Tyr Trp Arg Phe
                160                 165                 170

Asp Trp Phe Lys Asn Ala Asp Asn Pro Ser Phe Ser Phe Arg Gln Val
            175                 180                 185

Gln Cys Pro Ala Glu Leu Val Ala Arg Thr Gly Cys Arg Arg Asn Asp
        190                 195                 200

Asp Gly Asn Phe Pro Ala Val Gln Ile Pro Ser Ser Ser Thr Ser Ser
    205                 210                 215
Pro Val Asn Gln Pro Thr Ser Thr Ser Thr Thr Ser Thr Ser Thr Thr
220                 225                 230                 235

Ser Ser Pro Pro Val Gln Pro Thr Thr Pro Ser Gly Cys Thr Ala Glu
                240                 245                 250

Arg Trp Ala Gln Cys Gly Gly Asn Gly Trp Ser Gly Cys Thr Thr Cys
            255                 260                 265

Val Ala Gly Ser Thr Cys Thr Lys Ile Asn Asp Trp Tyr His Gln Cys
        270                 275                 280

Leu
(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1473 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Fusarium oxysporum
(B) STRAIN: DSM 2672
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 97..1224

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
GAATTOGOGG COGCTCATTC ACTTCAIITCA TTCITTAGAA TTACATACAC TCTCITTCAA 60 



AACACTCACT CTITAAACAA AACAACITTT GCAACA ATG CGA TCT TAC ACT CTT 114
                                        Met Arg Ser Tyr Thr Leu
                                        1               5

CTC GCC CTG GCC GGC CCT CTC GCC GTC AGT GCT GCT TCT GGA AGC GGT 162
Leu Ala Leu Ala Gly Pro Leu Ala Val Ser Ala Ala Ser Gly Ser Gly
            10                  15                  20

CAC TCT ACT CGA TAC TGG GAT TGC TGC AAG CCT TCT TGC TCT TGG AGC 210
His Ser Thr Arg Tyr Trp Asp Cys Cys Lys Pro Ser Cys Ser Trp Ser
        25                  30                  35

GGA AAG GCT GCT GTC AAC GCC CCT GCT TTA ACT TGT GAT AAG AAC GAC 258
Gly Lys Ala Ala Val Asn Ala Pro Ala Leu Thr Cys Asp Lys Asn Asp
    40                  45                  50

AAC CCC ATT TCC AAC ACC AAT GCT GTC AAC GGT TGT GAG GGT GGT GGT 306
Asn Pro Ile Ser Asn Thr Asn Ala Val Asn Gly Cys Glu Gly Gly Gly
55                  60                  65                  70

TCT GCT TAT GCT TGC ACC AAC TAC TCT CCC TGG GCT GTC AAC GAT GAG 354
Ser Ala Tyr Ala Cys Thr Asn Tyr Ser Pro Trp Ala Val Asn Asp Glu
                75                  80                  85

CTT GCC TAC GGT TTC GCT GCT ACC AAG ATC TCC GGT GGC TCC GAG GCC 402
Leu Ala Tyr Gly Phe Ala Ala Thr Lys Ile Ser Gly Gly Ser Glu Ala
            90                  95                  100

AGC TGG TGC TGT GCT TGC TAT GCT TTG ACC TTC ACC ACT GGC CCC GTC 450
Ser Trp Cys Cys Ala Cys Tyr Ala Leu Thr Phe Thr Thr Gly Pro Val
        105                 110                 115

AAG GGC AAG AAG ATG ATC GTC CAG TCC ACC AAC ACT GGA GGT GAT CTC 498
Lys Gly Lys Lys Met Ile Val Gln Ser Thr Asn Thr Gly Gly Asp Leu
    120                 125                 130

GGC GAC AAC CAC TTC GAT CTC ATG ATG CCC GGC GGT GGT GTC GGT ATC 546
Gly Asp Asn His Phe Asp Leu Met Met Pro Gly Gly Gly Val Gly Ile
135                 140                 145                 150

TTC GAC GGC TGC ACC TCT GAG TTC GGC AAG GCT CTC GGC GGT GCC CAG 594
Phe Asp Gly Cys Thr Ser Glu Phe Gly Lys Ala Leu Gly Gly Ala Gln
                155                 160                 165

TAC GGC GGT ATC TCC TCC CGA AGC GAA TGT GAT AGC TAC CCC GAG CTT 642
Tyr Gly Gly Ile Ser Ser Arg Ser Glu Cys Asp Ser Tyr Pro Glu Leu
            170                 175                 180

CTC AAG GAC GGT TGC CAC TGG CGA TTC GAC TGG TTC GAG AAC GCC GAC 690
Leu Lys Asp Gly Cys His Trp Arg Phe Asp Trp Phe Glu Asn Ala Asp
        185                 190                 195

AAC CCT GAC TTC ACC TTT GAG CAG GTT CAG TGC CCC AAG GCT CTC CTC 738
Asn Pro Asp Phe Thr Phe Glu Gln Val Gln Cys Pro Lys Ala Leu Leu
    200                 205                 210

GAC ATC AGT GGA TGC AAG CGT GAT GAC GAC TCC AGC TTC CCT GCC TTC 786
Asp Ile Ser Gly Cys Lys Arg Asp Asp Asp Ser Ser Phe Pro Ala Phe
215                 220                 225                 230

AAG GTT GAT ACC TCG GCC AGC AAG CCC CAG CCC TCC AGC TCC GCT AAG 834
Lys Val Asp Thr Ser Ala Ser Lys Pro Gln Pro Ser Ser Ser Ala Lys
                235                 240                 245

AAG ACC ACC TCC GCT GCT GCT GCC GCT CAG CCC CAG AAG ACC AAG GAT 882
Lys Thr Thr Ser Ala Ala Ala Ala Ala Gln Pro Gln Lys Thr Lys Asp
            250                 255                 260

TCC GCT CCT GTT GTC CAG AAG TCC TCC ACC AAG CCT GCC GCT CAG CCC 930
Ser Ala Pro Val Val Gln Lys Ser Ser Thr Lys Pro Ala Ala Gln Pro
        265                 270                 275

GAG CCT ACT AAG CCC GCC GAC AAG CCC CAG ACC GAC AAG CCT GTC GCC 978
Glu Pro Thr Lys Pro Ala Asp Lys Pro Gln Thr Asp Lys Pro Val Ala
    280                 285                 290

ACC AAG CCT GCT GCT ACC AAG CCC GTC CAA CCT GTC AAC AAG CCC AAG 1026
Thr Lys Pro Ala Ala Thr Lys Pro Val Gln Pro Val Asn Lys Pro Lys
295                 300                 305                 310

ACA ACC CAG AAG GTC CGT GGA ACC AAA ACC CGA GGA AGC TGC CCG GCC 1074
Thr Thr Gln Lys Val Arg Gly Thr Lys Thr Arg Gly Ser Cys Pro Ala
                315                 320                 325
AAG ACT GAC GCT ACC GCC AAG GCC TCC GTT GTC CCT GCT TAT TAC CAG 1122
Lys Thr Asp Ala Thr Ala Lys Ala Ser Val Val Pro Ala Tyr Tyr Gln
            330                 335                 340

TGT GGT GGT TCC AAG TCC GCT TAT CCC AAC GGC AAC CTC GCT TGC GCT 1170
Cys Gly Gly Ser Lys Ser Ala Tyr Pro Asn Gly Asn Leu Ala Cys Ala
        345                 350                 355

ACT GGA AGC AAG TGT GTC AAG CAG AAC GAG TAC TAC TCC CAG TGT GTC 1218
Thr Gly Ser Lys Cys Val Lys Gln Asn Glu Tyr Tyr Ser Gln Cys Val
    360                 365                 370

CCC AAC TAAATGGTAG AT0CAT0GGT TCTGGAAGAG ACTATGOGTC TCAGAAGGGA 1274
Pro Asn
375

T0CTCTCATG AGCAGGCTIG TCATTCTATA GCATGGCATC CTGGACCAAG TCTTOGAOOC 1334 
TICTTCTACA TAGTATATCT TCATTCTATA TATTTAGACA CATAGATAGC CTCTTCTCAG 1394 


OGACAACTQG CTACAAAAGA CTTGGCAGGC TICTTCAATA TTGACACACT TTCCTCCATA 1454 



AAAAAAAAAA AAAAAAAAA 1473
(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 376 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Arg Ser Tyr Thr Leu Leu Ala Leu Ala Gly Pro Leu Ala Val Ser
                5                   10                  15

Ala Ala Ser Gly Ser Gly His Ser Thr Arg Tyr Trp Asp Cys Cys Lys
            20                  25                  30

Pro Ser Cys Ser Trp Ser Gly Lys Ala Ala Val Asn Ala Pro Ala Leu
        35                  40                  45

Thr Cys Asp Lys Asn Asp Asn Pro Ile Ser Asn Thr Asn Ala Val Asn
    50                  55                  60

Gly Cys Glu Gly Gly Gly Ser Ala Tyr Ala Cys Thr Asn Tyr Ser Pro
65                  70                  75                  80

Trp Ala Val Asn Asp Glu Leu Ala Tyr Gly Phe Ala Ala Thr Lys Ile
                85                  90                  95



Ser Gly Gly Ser Glu Ala Ser Trp Cys Cys Ala Cys Tyr Ala Leu Thr
            100                 105                 110

Phe Thr Thr Gly Pro Val Lys Gly Lys Lys Met Ile Val Gln Ser Thr
        115                 120                 125

Asn Thr Gly Gly Asp Leu Gly Asp Asn His Phe Asp Leu Met Met Pro
    130                 135                 140

Gly Gly Gly Val Gly Ile Phe Asp Gly Cys Thr Ser Glu Phe Gly Lys
145                 150                 155                 160

Ala Leu Gly Gly Ala Gln Tyr Gly Gly Ile Ser Ser Arg Ser Glu Cys
                165                 170                 175

Asp Ser Tyr Pro Glu Leu Leu Lys Asp Gly Cys His Trp Arg Phe Asp
            180                 185                 190

Trp Phe Glu Asn Ala Asp Asn Pro Asp Phe Thr Phe Glu Gln Val Gln
        195                 200                 205

Cys Pro Lys Ala Leu Leu Asp Ile Ser Gly Cys Lys Arg Asp Asp Asp
    210                 215                 220

Ser Ser Phe Pro Ala Phe Lys Val Asp Thr Ser Ala Ser Lys Pro Gln
225                 230                 235                 240
Pro Ser Ser Ser Ala Lys Lys Thr Thr Ser Ala Ala Ala Ala Ala Gln
                245                 250                 255

Pro Gln Lys Thr Lys Asp Ser Ala Pro Val Val Gln Lys Ser Ser Thr
            260                 265                 270

Lys Pro Ala Ala Gln Pro Glu Pro Thr Lys Pro Ala Asp Lys Pro Gln
        275                 280                 285

Thr Asp Lys Pro Val Ala Thr Lys Pro Ala Ala Thr Lys Pro Val Gln
    290                 295                 300

Pro Val Asn Lys Pro Lys Thr Thr Gln Lys Val Arg Gly Thr Lys Thr
305                 310                 315                 320

Arg Gly Ser Cys Pro Ala Lys Thr Asp Ala Thr Ala Lys Ala Ser Val
                325                 330                 335

Val Pro Ala Tyr Tyr Gln Cys Gly Gly Ser Lys Ser Ala Tyr Pro Asn
            340                 345                 350

Gly Asn Leu Ala Cys Ala Thr Gly Ser Lys Cys Val Lys Gln Asn Glu
        355                 360                 365

Tyr Tyr Ser Gln Cys Val Pro Asn
    370                 375



